fix(docker): remap container UID/GID at runtime to avoid volume mount permission errors#1923
fix(docker): remap container UID/GID at runtime to avoid volume mount permission errors#1923davison wants to merge 1 commit intopaperclipai:masterfrom
Conversation
Greptile SummaryThis PR adds runtime UID/GID remapping to the Docker image so that the The three issues raised in the previous review rounds have all been addressed cleanly:
Two minor P2 suggestions remain:
Confidence Score: 5/5Safe to merge — all previously raised blocking issues are resolved; remaining findings are minor style and hardening suggestions. All three P1 issues from prior rounds (ARG propagation, unconditional chown, broken primary GID) are correctly fixed in the latest push. The two remaining comments are P2 (absolute ENTRYPOINT path convention and a missing zero-UID guard), neither of which causes incorrect behaviour in normal use. The core UID/GID remapping logic is sound and follows the standard gosu pattern. No files require special attention. Important Files Changed
Prompt To Fix All With AIThis is a comment left during a code review.
Path: Dockerfile
Line: 73
Comment:
**ENTRYPOINT should use absolute path**
`ENTRYPOINT ["docker-entrypoint.sh"]` relies on `PATH` lookup at runtime. While `/usr/local/bin` is in `PATH` for this image and it works in practice, the canonical Docker convention (and what linters like Hadolint enforce) is to use the full absolute path in exec-form `ENTRYPOINT`. Using a relative name is fragile if anyone customises the image's `PATH`.
```suggestion
ENTRYPOINT ["/usr/local/bin/docker-entrypoint.sh"]
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: docker-entrypoint.sh
Line: 5-6
Comment:
**No guard against `PUID=0` / `PGID=0`**
If an operator accidentally (or deliberately) sets `USER_UID=0` or `USER_GID=0`, `usermod` will remap `node` to UID/GID 0. `exec gosu node` then starts the server process as root — the privilege-drop silently becomes a no-op. This defeats the security model of the whole entrypoint.
The standard pattern (used by linuxserver.io and Bitnami images) is to validate and reject 0 early:
```sh
if [ "$PUID" = "0" ] || [ "$PGID" = "0" ]; then
echo "ERROR: USER_UID and USER_GID must not be 0 (root). Refusing to start." >&2
exit 1
fi
```
Adding this check after lines 5–6 makes the failure loud and obvious rather than silent.
How can I resolve this? If you propose a fix, please make it concise.Reviews (4): Last reviewed commit: "fix: support host UID/GID mapping for vo..." | Re-trigger Greptile |
Dockerfile
Outdated
| USER_UID=1000 \ | ||
| USER_GID=1000 \ |
There was a problem hiding this comment.
Build-arg
USER_UID/USER_GID not propagated to runtime ENV
ARG values are scoped to the stage they are declared in. ARG USER_UID=1000 / ARG USER_GID=1000 are declared in the base stage, but the production stage never re-declares them. As a result, the ENV block hardcodes the literal 1000 rather than forwarding the build-arg value.
Concrete failure scenario: docker build --build-arg USER_UID=1001 -t paperclip-test .
basestage:usermod -u 1001 noderuns correctly —nodeuser is UID 1001 in the image layer.productionstage:ENV USER_UID=1000— runtime default is still 1000.- Container start: entrypoint reads
PUID=1000, seesid -u nodeis 1001, and runsusermod -o -u 1000 node— undoing the build-time baking entirely.
The fix is to re-declare the ARGs in the production stage and use them in the ENV block:
ARG USER_UID=1000
ARG USER_GID=1000
ENV NODE_ENV=production \
...
USER_UID=${USER_UID} \
USER_GID=${USER_GID} \
...
Prompt To Fix With AI
This is a comment left during a code review.
Path: Dockerfile
Line: 61-62
Comment:
**Build-arg `USER_UID`/`USER_GID` not propagated to runtime `ENV`**
`ARG` values are scoped to the stage they are declared in. `ARG USER_UID=1000` / `ARG USER_GID=1000` are declared in the `base` stage, but the `production` stage never re-declares them. As a result, the `ENV` block hardcodes the literal `1000` rather than forwarding the build-arg value.
**Concrete failure scenario:** `docker build --build-arg USER_UID=1001 -t paperclip-test .`
1. `base` stage: `usermod -u 1001 node` runs correctly — `node` user is UID 1001 in the image layer.
2. `production` stage: `ENV USER_UID=1000` — runtime default is still **1000**.
3. Container start: entrypoint reads `PUID=1000`, sees `id -u node` is 1001, and runs `usermod -o -u 1000 node` — **undoing the build-time baking entirely**.
The fix is to re-declare the ARGs in the `production` stage and use them in the `ENV` block:
```
ARG USER_UID=1000
ARG USER_GID=1000
ENV NODE_ENV=production \
...
USER_UID=${USER_UID} \
USER_GID=${USER_GID} \
...
```
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Fixed in the latest push (268a75e). Re-declared ARG USER_UID=1000 / ARG USER_GID=1000 at the top of the production stage and changed the ENV block to USER_UID=${USER_UID} / USER_GID=${USER_GID} so the build-arg value is correctly forwarded to the runtime environment.
docker-entrypoint.sh
Outdated
| fi | ||
|
|
||
| # Ensure correct ownership of the data volume | ||
| chown -R node:node /paperclip |
There was a problem hiding this comment.
chown -R runs unconditionally on every container start
chown -R node:node /paperclip is not guarded behind the UID/GID-change branches — it executes on every startup, including the common default case where no remapping is needed. This contradicts the PR description's claim that "the entrypoint detects no UID/GID change and immediately exec-s the server with zero overhead."
For an AI agent orchestration platform, /paperclip accumulates persistent state (logs, databases, agent artefacts). A recursive chown over a large directory tree on every container restart introduces measurable startup latency that will grow over time.
The standard remedy is to move chown inside the branches so it only runs when a remap actually happened:
changed=0
if [ "$(id -u node)" -ne "$PUID" ]; then
echo "Updating node UID to $PUID"
usermod -o -u "$PUID" node
changed=1
fi
if [ "$(id -g node)" -ne "$PGID" ]; then
echo "Updating node GID to $PGID"
groupmod -o -g "$PGID" node
changed=1
fi
if [ "$changed" = "1" ]; then
chown -R node:node /paperclip
fiPrompt To Fix With AI
This is a comment left during a code review.
Path: docker-entrypoint.sh
Line: 20
Comment:
**`chown -R` runs unconditionally on every container start**
`chown -R node:node /paperclip` is not guarded behind the UID/GID-change branches — it executes on **every startup**, including the common default case where no remapping is needed. This contradicts the PR description's claim that "the entrypoint detects no UID/GID change and immediately exec-s the server with zero overhead."
For an AI agent orchestration platform, `/paperclip` accumulates persistent state (logs, databases, agent artefacts). A recursive `chown` over a large directory tree on every container restart introduces measurable startup latency that will grow over time.
The standard remedy is to move `chown` inside the branches so it only runs when a remap actually happened:
```sh
changed=0
if [ "$(id -u node)" -ne "$PUID" ]; then
echo "Updating node UID to $PUID"
usermod -o -u "$PUID" node
changed=1
fi
if [ "$(id -g node)" -ne "$PGID" ]; then
echo "Updating node GID to $PGID"
groupmod -o -g "$PGID" node
changed=1
fi
if [ "$changed" = "1" ]; then
chown -R node:node /paperclip
fi
```
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Fixed in the latest push (268a75e). Added a changed=0 flag; chown -R node:node /paperclip now only runs when changed=1 (i.e. when usermod or groupmod actually executed). Default-UID containers skip the chown entirely and exec immediately.
3b88682 to
268a75e
Compare
|
Updated the PR with fixes for both P1 Greptile findings:
Both fixes are in commit 268a75e. |
1 similar comment
|
Updated the PR with fixes for both P1 Greptile findings:
Both fixes are in commit 268a75e. |
|
|
||
| if [ "$(id -g node)" -ne "$PGID" ]; then | ||
| echo "Updating node GID to $PGID" | ||
| groupmod -o -g "$PGID" node | ||
| changed=1 |
There was a problem hiding this comment.
GID remapping is silently broken
groupmod -o -g "$PGID" node updates the GID record in /etc/group but does not update the user's primary GID field stored in /etc/passwd. When exec gosu node "$@" runs, gosu calls getpwnam("node") which reads the primary GID from /etc/passwd — still the original value. The process therefore always starts with the old GID, regardless of what USER_GID is set to.
Concrete consequence: with USER_GID=1001:
groupmod -o -g 1001 node→/etc/groupnow saysnode:x:1001:chown -R node:node /paperclip→ sets/paperclipgid to 1001 (resolving via/etc/group)exec gosu node→ reads/etc/passwd, primary gid = 1000 (unchanged) → process runs as gid=1000- New files created by the server inside
/paperclipare owned by gid=1000, not the intended 1001 — the host user with gid=1001 has no group write access to them
The same issue affects the build-time groupmod step in the Dockerfile (line 11).
The fix is to call usermod -g "$PGID" node after groupmod, which updates the primary GID entry in /etc/passwd:
| if [ "$(id -g node)" -ne "$PGID" ]; then | |
| echo "Updating node GID to $PGID" | |
| groupmod -o -g "$PGID" node | |
| changed=1 | |
| if [ "$(id -g node)" -ne "$PGID" ]; then | |
| echo "Updating node GID to $PGID" | |
| groupmod -o -g "$PGID" node | |
| usermod -g "$PGID" node | |
| changed=1 | |
| fi |
And in the Dockerfile build-time block:
RUN usermod -u $USER_UID --non-unique node \
&& groupmod -g $USER_GID --non-unique node \
&& usermod -g $USER_GID nodePrompt To Fix With AI
This is a comment left during a code review.
Path: docker-entrypoint.sh
Line: 17-21
Comment:
**GID remapping is silently broken**
`groupmod -o -g "$PGID" node` updates the GID record in `/etc/group` but does **not** update the user's primary GID field stored in `/etc/passwd`. When `exec gosu node "$@"` runs, `gosu` calls `getpwnam("node")` which reads the primary GID from `/etc/passwd` — still the original value. The process therefore always starts with the old GID, regardless of what `USER_GID` is set to.
Concrete consequence: with `USER_GID=1001`:
1. `groupmod -o -g 1001 node` → `/etc/group` now says `node:x:1001:`
2. `chown -R node:node /paperclip` → sets `/paperclip` gid to 1001 (resolving via `/etc/group`)
3. `exec gosu node` → reads `/etc/passwd`, primary gid = 1000 (unchanged) → process runs as gid=1000
4. New files created by the server inside `/paperclip` are owned by gid=1000, not the intended 1001 — the host user with gid=1001 has no group write access to them
The same issue affects the build-time `groupmod` step in the `Dockerfile` (line 11).
The fix is to call `usermod -g "$PGID" node` after `groupmod`, which updates the primary GID entry in `/etc/passwd`:
```suggestion
if [ "$(id -g node)" -ne "$PGID" ]; then
echo "Updating node GID to $PGID"
groupmod -o -g "$PGID" node
usermod -g "$PGID" node
changed=1
fi
```
And in the `Dockerfile` build-time block:
```dockerfile
RUN usermod -u $USER_UID --non-unique node \
&& groupmod -g $USER_GID --non-unique node \
&& usermod -g $USER_GID node
```
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Fixed in the latest push (a0e94b4). Added usermod -g "$PGID" node after groupmod in docker-entrypoint.sh, and && usermod -g $USER_GID node after groupmod in the base stage Dockerfile. This updates the primary GID field in /etc/passwd so gosu node resolves the correct GID from the start.
268a75e to
a0e94b4
Compare
|
This (and the 2 queued up behind it awaiting this to be merged) will eventually address a number of open issues too:
The first one here should be a simple quality of life improvement for docker/podman users and has no effect on any functional part of the codebase. Greptile score is 5/5, could it be merged @cryppadotta 🙏🏻 ? |
Add USER_UID/USER_GID build args and gosu-based entrypoint to remap the node user's UID/GID at container startup, avoiding permission errors when mounting a host directory as /paperclip. - Build-time: usermod/groupmod adjust the node user to match build args - Runtime: docker-entrypoint.sh re-adjusts UID/GID from env vars and drops privileges via gosu before starting the server Co-Authored-By: Paperclip <noreply@paperclip.ing>
a0e94b4 to
3349f94
Compare
|
@devinfoley @cryppadotta - any chance this can either be merged or could someone possibly let me know why it can't be? Getting no feedback here or discord. Just trying to help out a bit. I can obviously publish from my fork, but I'm not sure that really helps the wider community much 😞 There are others queued up behind this that might add material benefit, but trying to keep the PRs small and focused for reviews |
Thinking Path
What Changed
DockerfileARG USER_UID=1000andARG USER_GID=1000build args (defaults to 1000 — no behaviour change for existing deployments)gosuto the base apt installusermod/groupmodat build time to pre-bake the UID/GID into the image layerUSER_UIDandUSER_GIDto theENVblock so they are available at runtimeUSER nodedirective withENTRYPOINT ["docker-entrypoint.sh"]docker-entrypoint.shto/usr/local/bin/and makes it executabledocker-entrypoint.sh(new file)USER_UID/USER_GIDenv vars at startupusermod/groupmodif the running UID/GID differ from the node user (handles runtime override without rebuild)chown -R node:node /paperclipto correct volume ownershipnodeuser viaexec gosu node "$@"— no persistent root processNote on customisation: A question that keeps coming up is how to add tools, MCP servers, or extra packages to the image (e.g. for specific agent adapters). The cleanest pattern is to build a downstream image using Paperclip as a base:
This PR keeps the upstream image minimal and focused. Documenting this pattern properly is a good follow-up.
Verification
Build and run with a non-default UID:
Default build (no args): identical behaviour to the previous image — server runs as
node(UID 1000), entrypoint is a no-op remap and proceeds immediately toexec gosu node.Risks
gosudrop — this is intentional and necessary forusermod/chown. The server process itself never runs as root. This is the standard pattern for this use case (Bitnami images, LinuxServer.io images, official Redis/Postgres images all do the same).gosuadds a small dependency (~1 MB).gosuis a well-known, audited privilege-drop tool that avoids the TTY and signal-handling issues ofsu/sudoin containers.Checklist